feat[gpu]: arrow device array stream support #8483

Merged
0ax1 merged 25 commits into develop from ad/arrow-device-array-stream on Jun 19, 2026
@0ax1 0ax1 commented Jun 17, 2026

Contributor

Adds Arrow device array stream support, exercised and tested through the cuDF harness.

0ax1 added 9 commits June 17, 2026 18:46
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
This reverts commit 52952d2.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
@0ax1 0ax1 added the changelog/feature A new feature label Jun 17, 2026
@codspeed-hq

codspeed-hq Bot commented Jun 17, 2026


Merging this PR will degrade performance by 10.99%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 2 improved benchmarks
❌ 7 regressed benchmarks
✅ 1572 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | decompress_rd[f64, (10000, 0.01)] | 108.7 µs | 139.1 µs | -21.89% |
| Simulation | decompress_rd[f64, (10000, 0.1)] | 109 µs | 139.5 µs | -21.85% |
| Simulation | decompress_rd[f64, (10000, 0.0)] | 108.7 µs | 139.1 µs | -21.83% |
| Simulation | decompress_rd[f32, (100000, 0.0)] | 496 µs | 583.8 µs | -15.05% |
| Simulation | decompress_rd[f32, (10000, 0.1)] | 78.1 µs | 91.2 µs | -14.43% |
| Simulation | decompress_rd[f32, (10000, 0.01)] | 78.1 µs | 91 µs | -14.2% |
| Simulation | decompress_rd[f32, (10000, 0.0)] | 78.5 µs | 91.2 µs | -13.91% |
| Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] | 206.8 µs | 170.2 µs | +21.45% |
| Simulation | chunked_varbinview_into_canonical[(100, 100)] | 307.1 µs | 272.5 µs | +12.71% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing ad/arrow-device-array-stream (f1a8999) with develop (35e4d72)

Open in CodSpeed

0ax1 added 5 commits June 17, 2026 21:28
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
@0ax1 0ax1 force-pushed the ad/arrow-device-array-stream branch from 3ae0d00 to 07926a0 Compare June 18, 2026 14:37
0ax1 added 5 commits June 18, 2026 14:59
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
@vortex-data vortex-data deleted a comment from github-actions Bot Jun 18, 2026
0ax1 added 5 commits June 18, 2026 16:38
The Arrow C device array stream export drove the Vortex stream on a private
CurrentThreadRuntime, but a partition scan spawns its decode work onto the
session's runtime (vortex-ffi's RUNTIME). Nothing ever drove that runtime
during streaming, so the first get_next on a real partition deadlocked
waiting on tasks that never ran. The existing tests only exercise an inert
in-memory stream, so they never hit it.

Thread the session's runtime through export_device_array_stream and drive
the stream and per-array exports on it, removing the private runtime and
worker pool. Expose vortex_ffi::runtime() so layered FFI crates can pass the
same runtime the partition's scan spawns onto.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
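
For readers following the deadlock described in the commit above, here is a minimal std-only sketch of the failure mode. It is not the Vortex code: `ToyRuntime`, `scan_partition`, and `get_next` are hypothetical stand-ins, and where the real `get_next` would block forever, this toy version returns `None` so the example terminates.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

/// Hypothetical toy runtime: queued tasks run only while someone calls `drive`.
#[derive(Default)]
pub struct ToyRuntime {
    queue: RefCell<VecDeque<Box<dyn FnOnce()>>>,
}

impl ToyRuntime {
    /// Queue a task; nothing executes until this runtime is driven.
    pub fn spawn(&self, task: impl FnOnce() + 'static) {
        self.queue.borrow_mut().push_back(Box::new(task));
    }

    /// Run queued tasks until the queue is empty; returns how many ran.
    pub fn drive(&self) -> usize {
        let mut ran = 0;
        loop {
            // The borrow ends with this statement, so a task may spawn more work.
            let task = self.queue.borrow_mut().pop_front();
            match task {
                Some(task) => {
                    task();
                    ran += 1;
                }
                None => return ran,
            }
        }
    }
}

/// Stand-in for a partition scan: decode work is spawned onto the *session* runtime.
pub fn scan_partition(session: &ToyRuntime, values: Vec<i64>) -> Rc<RefCell<Option<i64>>> {
    let slot = Rc::new(RefCell::new(None));
    let out = Rc::clone(&slot);
    session.spawn(move || {
        *out.borrow_mut() = Some(values.iter().sum::<i64>());
    });
    slot
}

/// Stand-in for the stream's get_next: drives `driver`, then checks for a result.
/// `None` here models "would wait forever" in the real, blocking setting.
pub fn get_next(driver: &ToyRuntime, slot: &Rc<RefCell<Option<i64>>>) -> Option<i64> {
    driver.drive();
    slot.borrow_mut().take()
}
```

Driving a private runtime leaves the decode task queued on the session runtime, so `get_next` never sees a result. Passing the session runtime itself as the driver runs the task and yields the value, which is the shape of the fix the commit describes.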
The device stream derives its schema from the first array and rejects any
later array whose Arrow schema differs, which is required by the Arrow C
stream contract but means a stream whose chunks vary their encoding (a
dictionary-encoded chunk among plain chunks) fails mid-stream. Document this
on the trait, note that an empty stream reports a dtype-derived schema that
can differ from a non-empty run, and sharpen the mismatch error to name the
cause.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
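
The schema rule in the commit above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `ToySchema`, `SchemaGuard`, and the error wording are hypothetical, and a real stream would compare exported Arrow schemas rather than string pairs.

```rust
/// Hypothetical stand-in for an Arrow schema: (field name, type name) pairs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToySchema(pub Vec<(String, String)>);

/// Convenience constructor for the sketch.
pub fn schema(fields: &[(&str, &str)]) -> ToySchema {
    ToySchema(
        fields
            .iter()
            .map(|(name, ty)| (name.to_string(), ty.to_string()))
            .collect(),
    )
}

/// Remembers the schema of the first array and rejects any later array that differs.
#[derive(Default)]
pub struct SchemaGuard {
    expected: Option<ToySchema>,
}

impl SchemaGuard {
    /// Check the array at position `index` in the stream against the first schema.
    pub fn check(&mut self, index: usize, got: &ToySchema) -> Result<(), String> {
        if let Some(first) = &self.expected {
            if first != got {
                // Name the cause: chunks of one stream must share a single schema.
                return Err(format!(
                    "array {} has a different Arrow schema than array 0 \
                     (expected {:?}, got {:?}); chunk encodings likely vary",
                    index, first, got
                ));
            }
            return Ok(());
        }
        // First array seen: its schema becomes the stream schema.
        self.expected = Some(got.clone());
        Ok(())
    }
}
```

With this guard, a stream of plain `int64` chunks passes, while a dictionary-encoded chunk appearing later is rejected with an error that points at the offending array index.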
Add a shared ArrowDeviceArray::empty() constructor and build the end-of-stream
marker from it, replacing the hand-rolled struct literal. The stream tests now
call the module-level release_schema/release_device_array helpers instead of
redefining byte-for-byte copies, and drop the duplicate empty_device_array
placeholder in favor of ArrowDeviceArray::empty().

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
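
As a rough picture of what a shared `empty()` constructor buys, the sketch below mirrors the struct layouts from the Arrow C data and C device data interfaces locally and builds a released (end-of-stream) value in one place. The structs are local mirrors, not the Vortex definitions; the zeroed `device_id` and `device_type` placeholders and the `is_released` helper are assumptions, since the PR text does not show the values Vortex uses.

```rust
use std::ffi::c_void;
use std::ptr;

/// Local mirror of the Arrow C data interface `ArrowArray` layout.
#[repr(C)]
pub struct ArrowArray {
    pub length: i64,
    pub null_count: i64,
    pub offset: i64,
    pub n_buffers: i64,
    pub n_children: i64,
    pub buffers: *mut *const c_void,
    pub children: *mut *mut ArrowArray,
    pub dictionary: *mut ArrowArray,
    pub release: Option<unsafe extern "C" fn(*mut ArrowArray)>,
    pub private_data: *mut c_void,
}

/// Local mirror of the Arrow C device data interface `ArrowDeviceArray` layout.
#[repr(C)]
pub struct ArrowDeviceArray {
    pub array: ArrowArray,
    pub device_id: i64,
    pub device_type: i32,
    pub sync_event: *mut c_void,
    pub reserved: [i64; 3],
}

impl ArrowDeviceArray {
    /// A released array: no buffers, no children, and a null release callback.
    /// In the Arrow C stream contract, a released array marks end of stream.
    /// Device fields are zeroed placeholders (an assumption of this sketch).
    pub fn empty() -> Self {
        ArrowDeviceArray {
            array: ArrowArray {
                length: 0,
                null_count: 0,
                offset: 0,
                n_buffers: 0,
                n_children: 0,
                buffers: ptr::null_mut(),
                children: ptr::null_mut(),
                dictionary: ptr::null_mut(),
                release: None,
                private_data: ptr::null_mut(),
            },
            device_id: 0,
            device_type: 0,
            sync_event: ptr::null_mut(),
            reserved: [0; 3],
        }
    }

    /// Hypothetical helper: true when the inner array has no release callback.
    pub fn is_released(&self) -> bool {
        self.array.release.is_none()
    }
}
```

Keeping the literal in a single constructor means the end-of-stream marker and test placeholders cannot drift apart when a field is added or reordered.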
Several doc and line comments added for the Arrow device array stream exceeded
the 100-column limit. Wrap them; no behavior change.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
@0ax1 0ax1 requested review from myrrc and robert3005 June 19, 2026 10:20
@0ax1 0ax1 marked this pull request as ready for review June 19, 2026 10:24
@0ax1 0ax1 requested a review from a team June 19, 2026 10:24

@myrrc myrrc left a comment

Contributor

Changes look good to me, but this PR would benefit from adding C-side tests.

Comment thread vortex-ffi/src/lib.rs Outdated
Comment thread vortex-cuda/ffi/src/lib.rs
Comment thread vortex-cuda/src/arrow/mod.rs Outdated
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
Comment thread vortex-cuda/src/arrow/mod.rs
@0ax1 0ax1 merged commit 812c277 into develop Jun 19, 2026
85 of 87 checks passed
@0ax1 0ax1 deleted the ad/arrow-device-array-stream branch June 19, 2026 12:04
Labels

changelog/feature A new feature

3 participants